Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
● The Royal Society
All preprints, ranked by how well they match Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences's content profile, based on 12 papers previously published here. The average preprint has a 0.00% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Zhu, J.; Rivera, K.; Baron, D.
Fast testing can help mitigate the coronavirus disease 2019 (COVID-19) pandemic. Despite their accuracy for single-sample analysis, infectious disease diagnostic tools such as RT-PCR require substantial resources to test large populations. We develop a scalable approach for determining the viral status of pooled patient samples. Our approach converts group testing to a linear inverse problem, where false positives and negatives are interpreted as generated by a noisy communication channel, and a message passing algorithm estimates the illness status of patients. Numerical results reveal that our approach estimates patient illness using fewer pooled measurements than existing noisy group testing algorithms. Our approach can easily be extended to various applications, including those where false negatives must be minimized. Finally, in a Utopian world we would have collaborated with RT-PCR experts; it is difficult to form such connections during a pandemic. We welcome new collaborators to reach out and help improve this work!
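As a rough illustration of the channel framing described in this abstract, the sketch below simulates pooled tests whose false positives and negatives act as a noisy channel, then recovers posterior illness probabilities. It uses exact enumeration over a small population in place of the authors' scalable message-passing algorithm; the pool design, prevalence and error rates are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

n, m = 12, 8              # patients, pooled tests
prevalence = 0.05         # prior P(ill), assumed
p_fp, p_fn = 0.02, 0.10   # noisy-channel parameters, assumed

A = (rng.random((m, n)) < 0.3).astype(int)          # pooling matrix
x_true = (rng.random(n) < prevalence).astype(int)   # hidden illness status

pool_pos = A @ x_true > 0                           # noiseless pool outcomes
flip = np.where(pool_pos, rng.random(m) < p_fn, rng.random(m) < p_fp)
y = pool_pos ^ flip                                 # observed noisy results

# Exact posterior marginals by enumerating all 2^n illness patterns; the
# paper's message-passing algorithm approximates this at much larger scale.
marginals, Z = np.zeros(n), 0.0
for bits in itertools.product([0, 1], repeat=n):
    x = np.array(bits)
    prior = prevalence**x.sum() * (1 - prevalence)**(n - x.sum())
    pos = A @ x > 0
    like = np.where(pos,
                    np.where(y, 1 - p_fn, p_fn),
                    np.where(y, p_fp, 1 - p_fp)).prod()
    Z += prior * like
    marginals += prior * like * x
marginals /= Z

print("truly ill:", np.flatnonzero(x_true))
print("P(ill | tests):", marginals.round(3))
```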
McKinley, T. J.; Williamson, D. B.; Xiong, X.; Salter, J. M.; Challen, R.; Danon, L.; Youngman, B. D.; McNeall, D.
Calibration of complex stochastic infectious disease models is challenging. These often have high-dimensional input and output spaces, with the models exhibiting complex, non-linear dynamics. Coupled with a paucity of necessary data, this results in a large number of non-ignorable hidden states that must be handled by the inference routine. Likelihood-based approaches to this missing data problem are very flexible, but challenging to scale, due to having to monitor and update these hidden states. Methods based on simulating the hidden states directly from the model-of-interest have an advantage that they are often more straightforward to code, and thus are easier to implement and adapt in real-time. However, these often require evaluating very large numbers of simulations, rendering them infeasible for many large-scale problems. We present a framework for using emulation-based methods to calibrate a large-scale, stochastic, age-structured, spatial meta-population model of COVID-19 transmission in England and Wales. By embedding a model discrepancy process into the simulation model, and combining this with particle filtering, we show that it is possible to calibrate complex models to high-dimensional data by emulating the log-likelihood surface instead of individual data points. The use of embedded model discrepancy also helps to alleviate other key challenges, such as the introduction of infection across space and time. We conclude with a discussion of major challenges remaining and key areas for future work.
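The core idea of emulating the log-likelihood surface rather than individual data points can be sketched as follows, with scikit-learn standing in for a full Gaussian-process emulation workflow. The toy log-likelihood, design size and history-matching cut-off are all assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(9)

# Stand-in for an expensive simulator: log-likelihood as a function of two
# parameters. In the paper this value would come from a particle filter run
# on the full stochastic meta-population model.
def expensive_loglik(theta):
    return -((theta[0] - 0.3)**2 / 0.01 + (theta[1] - 5.0)**2 / 4.0)

# Step 1: evaluate the simulator at a small space-filling design
design = np.column_stack([rng.uniform(0.1, 0.6, 40), rng.uniform(1, 10, 40)])
loglik = np.array([expensive_loglik(t) for t in design])

# Step 2: emulate the log-likelihood surface rather than the data themselves
gp = GaussianProcessRegressor(ConstantKernel() * RBF([0.1, 2.0]),
                              normalize_y=True).fit(design, loglik)

# Step 3: the cheap emulator screens candidate parameters for calibration
cand = np.column_stack([rng.uniform(0.1, 0.6, 5000), rng.uniform(1, 10, 5000)])
mu, sd = gp.predict(cand, return_std=True)
plausible = cand[mu + 3 * sd > loglik.max() - 3]   # history-matching-style cut
print("retained", len(plausible), "of", len(cand), "candidate parameter sets")
```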
Perley, A. S.; Martinez, M. E.; Mercadante, T.; Liu, S.; Coleman, T. P.
The dynamics of heartbeat intervals provide important insights into cardiovascular and autonomic nervous system function. Conventional analytical approaches often use fixed-window averaging, which can obscure rapid changes and reduce temporal resolution. Point process models address this limitation by operating in continuous time, enabling more precise characterization of heartbeat variability. A landmark example is the history-dependent inverse Gaussian (IG) point process model of Barbieri et al. (2005), which captures temporal dependencies in heartbeat timing. However, the nonconvex likelihood of the IG model complicates parameter estimation, requiring careful initialization and adding computational burden. In this work, we introduce a convex alternative: a history-dependent gamma generalized linear model (GLM) for heartbeat dynamics. Applied to a tilt-table dataset, our approach yields accurate and robust heart rate estimation. We further extend the model to two more applications: (1) sequential prediction of interbeat intervals, outperforming common machine learning algorithms, and (2) computation of information-theoretic measures demonstrating its utility in quantifying the influence of cardiac medications on heartbeat dynamics.
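A minimal sketch of a history-dependent gamma GLM for interbeat intervals, using statsmodels. The history order, identity link and simulated R-R series are illustrative assumptions; the paper's model and dataset will differ in detail.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Synthetic interbeat intervals (seconds) whose mean depends on the two
# preceding intervals -- a stand-in for real R-R data (parameters assumed).
T = 2000
rr = np.empty(T)
rr[:2] = 0.8
for t in range(2, T):
    mu = 0.1 + 0.5 * rr[t - 1] + 0.3 * rr[t - 2]
    rr[t] = rng.gamma(shape=50.0, scale=mu / 50.0)   # gamma noise, mean mu

p = 2                                                # history order
X = sm.add_constant(np.column_stack([rr[p - 1 - k:T - 1 - k] for k in range(p)]))
y = rr[p:]

# Gamma GLM with identity link: E[RR_t] = b0 + b1*RR_{t-1} + b2*RR_{t-2}.
# (Handle both old and new statsmodels link-class names.)
Identity = getattr(sm.families.links, "Identity", None) or sm.families.links.identity
res = sm.GLM(y, X, family=sm.families.Gamma(link=Identity())).fit()
print(res.params)                  # should land near [0.1, 0.5, 0.3]

# One-step-ahead prediction of the next interbeat interval
print("next RR:", np.r_[1.0, rr[-1], rr[-2]] @ res.params)
```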
Hillsley, A.; Stein, J.; Tillberg, P. W.; Stern, D. L.; Funke, J.
We address the problem of inferring the number of independently blinking fluorescent light emitters, when only their combined intensity contributions can be observed at each timepoint. This problem occurs regularly in light microscopy of objects that are smaller than the diffraction limit, where one wishes to count the number of fluorescently labelled subunits. Our proposed solution directly models the photo-physics of the system, as well as the blinking kinetics of the fluorescent emitters as a fully differentiable hidden Markov model. Given a trace of intensity over time, our model jointly estimates the parameters of the intensity distribution per emitter, their blinking rates, as well as a posterior distribution of the total number of fluorescent emitters. We show that our model is consistently more accurate and increases the range of countable subunits by a factor of two compared to current state-of-the-art methods, which count based on autocorrelation and blinking frequency. Furthermore, we demonstrate that our model can be used to investigate the effect of blinking kinetics on counting ability, and therefore can inform experimental conditions that will maximize counting accuracy.
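The counting idea can be illustrated with a small hidden Markov model in which the hidden state is the number of emitters currently "on" and the evidence for each candidate count is computed with the forward algorithm. This discrete sketch omits the paper's differentiable photo-physics model; all rates and noise levels are assumed.

```python
import numpy as np
from scipy.stats import binom, norm

def log_evidence(trace, K, p_on=0.1, p_off=0.3, mu=1.0, sigma=0.2):
    """Forward-algorithm log-likelihood of an intensity trace given K emitters.
    Hidden state = number of emitters 'on'; each switches on with prob p_on
    and off with prob p_off per frame. Intensity ~ Normal(k * mu, sigma)."""
    ks = np.arange(K + 1)
    T = np.zeros((K + 1, K + 1))
    for k in ks:
        stay = binom.pmf(np.arange(k + 1), k, 1 - p_off)       # on -> on
        join = binom.pmf(np.arange(K - k + 1), K - k, p_on)    # off -> on
        T[k, :] = np.convolve(stay, join)                      # next-count pmf
    alpha = np.full(K + 1, 1.0 / (K + 1))                      # uniform start
    logL = 0.0
    for y in trace:
        alpha = (alpha @ T) * norm.pdf(y, ks * mu, sigma)
        s = alpha.sum()
        logL += np.log(s)
        alpha /= s
    return logL

# Simulate a trace from K_true = 4 emitters and compare candidate counts
rng = np.random.default_rng(2)
K_true, n_frames = 4, 500
state = rng.random(K_true) < 0.5
ys = []
for _ in range(n_frames):
    ys.append(rng.normal(state.sum() * 1.0, 0.2))
    state = np.where(state, rng.random(K_true) > 0.3, rng.random(K_true) < 0.1)

scores = {K: log_evidence(np.array(ys), K) for K in range(1, 9)}
print("MAP emitter count:", max(scores, key=scores.get))
```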
Verstraeten, B.; Lootens, S.; Van Den Abeele, R.; Van Nieuwenhuize, V.; Okenov, A.; Hendrickx, S.; Santos bezzera, A.; Nezlobinskii, T.; Kappadan, V.; Handa, B. S.; Ng, F. S.; Duytschaever, M.; Vandersickel, N.
Phase mapping is a widespread method for identifying rotational electrical activity sustaining cardiac arrhythmias. However, conventional implementations assume that the cardiac phase map is continuous, leading to ill-defined phase indices in regions affected by functional conduction block, fibrosis, or anatomical boundaries. These regions of discontinuous or undefined phase, termed phase defects, lead to both false positive and false negative detections of rotational drivers. This work introduces an improved phase mapping implementation termed extended phase mapping that explicitly detects and accounts for phase defects, enabling robust calculation of the phase index around them. Extended phase mapping is applied to (1) simulated excitation patterns using the Fenton-Karma model, (2) experimental optical mapping data of rat ventricular fibrillation, and (3) a clinical CARTO activation map of atrial tachycardia. Across all datasets, the extended approach eliminates erroneous detections and resolves previously missed rotations. Our results demonstrate that proper treatment of phase defects yields a unified and physiologically consistent characterization of all rotational drivers including near-complete and anatomical reentries. Therefore, we propose replacing the classical notion of phase singularities with critical phase defects as the fundamental entities governing rotational dynamics in cardiac tissue.
Author summary: Detecting rotating electrical activity in the heart is crucial for understanding and treating abnormal heart rhythms. A common method, phase mapping, assigns a timing phase to each region of the heart to identify these rotations. However, in regions affected by scars, blocked conduction, or anatomical boundaries, the phase can become undefined or discontinuous. These so-called phase defects make current methods unreliable, causing false detections or missed rotations. In this study, we introduce an extended method that explicitly identifies phase defects and calculates phase indices around them. We test this approach using computer simulations, experimental recordings from animal hearts, and clinical heart-mapping data. Across all datasets, it eliminates false detections and reveals previously overlooked rotational activity. By properly accounting for phase defects, the extended phase mapping method provides a more reliable and complete characterization of heart rhythms, offering a physiologically meaningful framework for studying electrical dynamics in cardiac tissue.
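A minimal sketch of the phase-index computation that underlies phase mapping: summing wrapped phase differences around a closed loop yields the winding number, which is non-zero only when the loop encloses a rotor. The extended method's phase-defect detection is not reproduced here; the synthetic rotor is an assumption.

```python
import numpy as np

def phase_index(phase, loop):
    """Winding number of a phase map along a closed pixel loop.

    `phase` is a 2D array of phases in radians; `loop` lists (row, col)
    pixels tracing a closed path. Each step's phase difference is wrapped to
    (-pi, pi]; a continuous phase field yields 0, a rotor yields +/-1."""
    total = 0.0
    for (r0, c0), (r1, c1) in zip(loop, loop[1:] + loop[:1]):
        d = phase[r1, c1] - phase[r0, c0]
        total += (d + np.pi) % (2 * np.pi) - np.pi   # wrap to (-pi, pi]
    return int(np.rint(total / (2 * np.pi)))

# Synthetic rotor: phase = polar angle around the grid centre
n = 64
yy, xx = np.mgrid[0:n, 0:n]
phase = np.arctan2(yy - n / 2, xx - n / 2)

# Loop around the core -> index +/-1; a loop far from the core -> 0
core_loop = [(28, 28), (28, 36), (36, 36), (36, 28)]
far_loop = [(2, 2), (2, 6), (6, 6), (6, 2)]
print(phase_index(phase, core_loop), phase_index(phase, far_loop))
```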
Zabeti, H.; Dexter, N.; Lau, I.; Unruh, L.; Adcock, B.; Chindelevitch, L.
Group testing, the testing paradigm which combines multiple samples within a single test, was introduced in 1943 by Robert Dorfman. Since its original proposal for syphilis screening, group testing has been applied in domains such as fault identification in electrical and computer networks, machine learning, data mining, and cryptography. The SARS-CoV-2 pandemic has led to proposals for using group testing in its original context of identifying infected individuals in a population with few tests. Studies suggest that non-adaptive group testing - in which all the tests are determined in advance - for SARS-CoV-2 could help save 20% to 90% of tests depending on the prevalence. However, no systematic approach for comparing different non-adaptive group testing strategies currently exists. In this paper we develop a software platform for evaluating non-adaptive group testing strategies in both a noiseless setting and in the presence of realistic noise sources, modelled on published experimental observations, which makes them applicable to polymerase chain reaction (PCR) tests, the dominant type of tests for SARS-CoV-2. This modular platform can be used with a variety of group testing designs and decoding algorithms. We use it to evaluate the performance of near-doubly-regular designs and a decoding algorithm based on an integer linear programming formulation, both of which are known to be optimal in some regimes. We find savings between 40% and 91% of tests for prevalences up to 10% when a small error (below 5%) is allowed. We also find that the performance degrades gracefully with noise. We expect our modular, user-friendly, publicly available platform to facilitate empirical research into non-adaptive group testing for SARS-CoV-2.
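The ILP decoding step can be sketched in the noiseless setting with scipy's mixed-integer solver: items seen only in negative tests are cleared, and the sparsest status vector consistent with the positive tests is sought. The random design below is not a near-doubly-regular design, and recovery is not guaranteed for it.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(3)

n, m, d = 60, 24, 3                             # items, tests, defectives
A = (rng.random((m, n)) < 0.12).astype(float)   # random pooling design
x_true = np.zeros(n)
x_true[rng.choice(n, d, replace=False)] = 1
y = (A @ x_true > 0).astype(float)              # noiseless pooled outcomes

# Decode as the sparsest consistent explanation:
#   minimise sum(x)  s.t.  x_i = 0 for items only in negative tests,
#                          each positive test contains >= 1 positive item
neg_mask = A[y == 0].sum(axis=0) > 0            # items in any negative test
ub = np.where(neg_mask, 0.0, 1.0)
res = milp(c=np.ones(n),
           constraints=LinearConstraint(A[y == 1], lb=1, ub=np.inf),
           bounds=Bounds(0, ub),
           integrality=np.ones(n))
x_hat = np.round(res.x)
print("recovered:", np.flatnonzero(x_hat), "true:", np.flatnonzero(x_true))
```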
Brenlla-Lopez, A.; Deen, L.; Annibale, P.
Since the advent of stochastic localization microscopy approaches in 2006, the number of studies employing this strategy to investigate the sub-diffraction-limit features of fluorescently labeled structures in biology, biophysics and solid state samples has increased exponentially. Underpinning all these approaches is the notion that the position of single molecules can be determined to high precision, provided enough photons are collected. Determining exactly how precisely has been delegated to formulas that approximate the so-called Cramér-Rao lower bound from input parameters such as the number of photons collected from the molecules or the size of the camera pixel. These estimates should, however, be checked against the experimental localization precision, which can be determined straightforwardly by studying the distance between a pair of beads rather than looking at single beads. We revisit here a few key works, observing how these theoretical determinations routinely underestimate the experimental localization error, typically by a factor of two. We provide a software-independent metric for determining, for each individual setup, the appropriate value to assign to the localization error of individual emitters.
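A minimal sketch of the bead-pair idea: because two independent localizations contribute to each measured distance, the spread of repeated distance measurements gives the per-emitter precision directly. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Repeated localizations of two beads a fixed distance apart. Each fitted
# position carries the (unknown) localization error sigma_loc per axis.
sigma_loc, d_true, n_frames = 10.0, 200.0, 2000      # nm, assumed values
p1 = rng.normal([0.0, 0.0], sigma_loc, size=(n_frames, 2))
p2 = rng.normal([d_true, 0.0], sigma_loc, size=(n_frames, 2))

d = np.linalg.norm(p2 - p1, axis=1)

# The measured distance fluctuates with variance ~ 2*sigma_loc^2 (two
# independent localizations contribute), so the per-emitter precision is
# std(d)/sqrt(2) -- a software-independent estimate for this setup.
sigma_hat = d.std(ddof=1) / np.sqrt(2)
print(f"estimated localization precision: {sigma_hat:.1f} nm (true {sigma_loc} nm)")
```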
Steyn, N.; Parag, K. V.
The instantaneous reproduction number (Rt) is a key measure of the rate of spread of an infectious disease. Correctly quantifying uncertainty in Rt estimates is crucial for making well-informed decisions. Popular Rt estimators leverage smoothing techniques to distinguish signal from noise. Examples include EpiEstim and EpiFilter, which are both controlled by a "smoothing parameter" that is traditionally selected by users. We demonstrate that appropriate values of these smoothing parameters are not known a priori and vary markedly with epidemic dynamics, and show that data-driven smoothing is crucial for accurate uncertainty quantification of Rt estimates. We derive model likelihoods for the smoothing parameters in both EpiEstim and EpiFilter and develop a Bayesian framework to automatically marginalise these parameters when fitting to epidemiological time-series data. This yields novel marginal posterior predictive distributions which prove integral to rigorous model evaluation. Applying our methods, we find that default parameterisations of these widely-used estimators can negatively impact Rt inference, delaying detection of epidemic growth and misrepresenting uncertainty (typically producing overconfident estimates), with implications for public health decision-making. Our extensions mitigate these issues, provide a principled approach to uncertainty quantification, improve the robustness of real-time Rt inference, and facilitate model comparison using observable quantities.
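The marginalization idea can be sketched for an EpiEstim-style estimator: for each candidate smoothing window, score the one-step-ahead predictive likelihood of the case series, then weight windows by the resulting evidence. The renewal simulation, gamma prior and window grid below are assumptions, not the paper's exact likelihoods.

```python
import numpy as np
from scipy.special import gammaln

def window_log_evidence(I, Lam, k, t0, a=1.0, b=5.0):
    """One-step-ahead log predictive score under an EpiEstim-style model:
    R ~ Gamma(a, b) prior, Poisson renewal likelihood, R estimated from the
    last k days (the smoothing window)."""
    logp = 0.0
    for t in range(t0, len(I)):
        sh = a + I[t - k:t].sum()            # gamma posterior from window
        ra = b + Lam[t - k:t].sum()
        # negative-binomial predictive for I_t given Lambda_t
        logp += (gammaln(sh + I[t]) - gammaln(sh) - gammaln(I[t] + 1)
                 + sh * np.log(ra / (ra + Lam[t])))
        if I[t] > 0:
            logp += I[t] * np.log(Lam[t] / (ra + Lam[t]))
    return logp

rng = np.random.default_rng(5)
w = np.array([0.2, 0.5, 0.3])                # generation interval pmf, assumed
I = [20.0, 25.0, 30.0]
for t in range(3, 100):
    Lam_t = w @ np.array(I[t - 3:t][::-1])
    R_t = 1.2 if t < 50 else 0.8             # step change in transmission
    I.append(float(rng.poisson(R_t * Lam_t)))
I = np.array(I)
Lam = np.array([w @ I[t - 3:t][::-1] for t in range(3, len(I))])

ks = np.arange(2, 15)
scores = np.array([window_log_evidence(I[3:], Lam, k, t0=ks.max()) for k in ks])
post = np.exp(scores - scores.max()); post /= post.sum()
print("most supported window:", ks[post.argmax()], "days")
```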
Rehms, R.; Ellenbach, N.; Rehfuess, E. A.; Burns, J.; Mansmann, U.; Hoffmann, S.
Coronavirus disease (COVID-19) has highlighted both the shortcomings and value of modelling infectious diseases. Infectious disease models can serve as critical tools to predict the development of cases and associated healthcare demand and to determine the set of non-pharmaceutical interventions (NPI) that is most effective in slowing the spread of the infectious agent. Current approaches to estimate NPI effects typically focus on relatively short time periods and either on the number of reported cases, deaths, intensive care occupancy or hospital occupancy as a single indicator of disease transmission. In this work, we propose a Bayesian hierarchical model that integrates multiple outcomes and complementary sources of information in the estimation of the true and unknown number of infections while accounting for time-varying under-reporting and weekday-specific delays in reported cases and deaths, allowing us to estimate the number of infections on a daily basis rather than having to smooth the data. Using information from the entire course of the pandemic, we account for the spread of variants of concern, seasonality and vaccination coverage in the model. We implement a Markov Chain Monte Carlo algorithm to conduct Bayesian inference and estimate the effect of NPIs for 20 European countries. The approach shows good performance on simulated data and produces posterior predictions that show a good fit to reported cases, deaths, hospital and intensive care occupancy.
Gressani, O.; Faes, C.; Hens, N.
In epidemic models, the effective reproduction number is of central importance to assess the transmission dynamics of an infectious disease and to orient health intervention strategies. Publicly shared data during an outbreak often suffers from two sources of misreporting (underreporting and delay in reporting) that should not be overlooked when estimating epidemiological parameters. The main statistical challenge in models that intrinsically account for a misreporting process lies in the joint estimation of the time-varying reproduction number and the delay/underreporting parameters. Existing Bayesian approaches typically rely on Markov chain Monte Carlo (MCMC) algorithms that are extremely costly from a computational perspective. We propose a much faster alternative based on Laplacian-P-splines (LPS) that combines Bayesian penalized B-splines for flexible and smooth estimation of the time-varying reproduction number and Laplace approximations to selected posterior distributions for fast computation. Assuming a known generation interval distribution, the incidence at a given calendar time is governed by the epidemic renewal equation and the delay structure is specified through a composite link framework. Laplace approximations to the conditional posterior of the spline vector are obtained from analytical versions of the gradient and Hessian of the log-likelihood, implying a drastic speed-up in the computation of posterior estimates. Furthermore, the proposed LPS approach can be used to obtain point estimates and approximate credible intervals for the delay and reporting probabilities. Simulations of epidemics with different combinations of underreporting rates and delay structures (one-day, two-day and weekend delays) show that the proposed LPS methodology delivers fast and accurate estimates, outperforming existing methods that do not take underreporting and delay patterns into account. Finally, LPS is illustrated on two real case studies of epidemic outbreaks.
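The generative side of such a model, the renewal equation combined with under-reporting and a reporting-delay composite link, can be sketched as follows; the LPS inference machinery itself is not reproduced. Generation interval, reporting probability and delay pmf are assumed values.

```python
import numpy as np

rng = np.random.default_rng(6)

w = np.array([0.3, 0.4, 0.2, 0.1])        # generation interval pmf (assumed)
rho = 0.6                                 # reporting probability (assumed)
delay = np.array([0.0, 0.5, 0.3, 0.2])    # reporting-delay pmf in days (assumed)

T = 60
R = np.concatenate([np.full(30, 1.5), np.full(30, 0.8)])   # time-varying R_t
mu = np.zeros(T)
mu[:4] = 10.0                                # seed infections
for t in range(4, T):
    mu[t] = R[t] * (w @ mu[t - 4:t][::-1])   # epidemic renewal equation

# Composite link: expected reported cases are under-reported and delayed
expected = rho * np.convolve(mu, delay)[:T]
cases = rng.poisson(expected)
print(cases)
```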
Mieskolainen, M.; Bainbridge, R.; Buchmueller, O.; Lyons, L.; Wardle, N.
The determination of the infection fatality rate (IFR) for the novel SARS-CoV-2 coronavirus is a key aim for many of the field studies that are currently being undertaken in response to the pandemic. The IFR and the basic reproduction number R0 are the main epidemic parameters describing the severity and transmissibility of the virus, respectively. The IFR can also be used as a basis for estimating and monitoring the number of infected individuals in a population, which may subsequently be used to inform policy decisions relating to public health interventions and lockdown strategies. The interpretation of IFR measurements requires the calculation of confidence intervals. We present a number of statistical methods that are relevant in this context and develop an inverse problem formulation to determine correction factors to mitigate time-dependent effects that can lead to biased IFR estimates. We also review a number of methods to combine IFR estimates from multiple independent studies, provide example calculations throughout this note and conclude with a summary and "best practice" recommendations. The developed code is available online.
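As a concrete example of one standard interval relevant in this context, the sketch below computes a Clopper-Pearson confidence interval for the IFR from death and infection counts; it is not the note's full methodology, which also handles time-dependent biases.

```python
from scipy.stats import beta

def ifr_interval(deaths, infections, level=0.95):
    """Clopper-Pearson interval for IFR = deaths / infections.

    Treats deaths as Binomial(infections, IFR); this ignores uncertainty in
    the infection count itself, which the note's methods also address."""
    a = (1 - level) / 2
    lo = beta.ppf(a, deaths, infections - deaths + 1) if deaths > 0 else 0.0
    hi = beta.ppf(1 - a, deaths + 1, infections - deaths)
    return lo, hi

print(ifr_interval(deaths=7, infections=1200))   # e.g. a small serosurvey
```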
Amselem, E.; Broadwater, B.; Havermark, T.; Johansson, M.; Elf, J.
Sub-ms 3D tracking of individual molecules in living cells is an important goal for microscopy since it will enable measurements at the scale of diffusion-limited macromolecular interactions. Here, we present a 3D tracking principle based on the true excitation point spread function and cross-entropy minimization for position localization of moving fluorescent reporters that approaches the relevant regime. When tested on beads moved on a stage, we reached 67 nm lateral and 109 nm axial precision with a time resolution of 0.84 ms at a photon count rate of 60 kHz, coming close to the theoretical and simulated predictions. A critical step in the implementation was a new method for microsecond 3D PSF positioning that combines 3D holographic beam shaping and electro-optical deflection. For the analysis of tracking data, a new point estimator for diffusion was derived and evaluated by a detailed simulation of the 3D tracking principle applied to a fictive reaction-diffusion process in an E. coli-like geometry. Finally, we successfully applied these methods to track the Trigger Factor protein in living bacterial cells. Overall, our results show that it is possible to reach sub-millisecond live-cell single-molecule tracking, but that it is still hard to resolve state transitions based on diffusivity at this time scale.
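The flavour of a diffusion point estimator for such tracking data can be sketched with the textbook lag-1 MSD estimator, corrected for localization error; the paper derives its own estimator, so this is a stand-in under assumed parameters.

```python
import numpy as np

def diffusion_estimate(track, dt, sigma_loc=0.0):
    """Point estimate of D from a single-molecule track.

    Lag-1 MSD estimator with a localization-error correction:
    E[|x_{i+1} - x_i|^2] = 2*d*D*dt + 2*d*sigma_loc^2 in d dimensions."""
    disp2 = np.sum(np.diff(track, axis=0)**2, axis=1)
    d = track.shape[1]
    return (disp2.mean() - 2 * d * sigma_loc**2) / (2 * d * dt)

rng = np.random.default_rng(7)
D_true, dt, sigma_loc, n = 1.0, 0.84e-3, 0.03, 2000   # um^2/s, s, um, steps
steps = rng.normal(0, np.sqrt(2 * D_true * dt), size=(n, 3))
track = np.cumsum(steps, axis=0) + rng.normal(0, sigma_loc, size=(n, 3))
print("D estimate:", diffusion_estimate(track, dt, sigma_loc))
```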
Liu, S.; Pani, S.; Khan, S. A.; Becerra, F. E.; Lidke, K. A.
According to Rayleigh's criterion, two incoherent emitters with a separation below the diffraction limit are not resolvable with a conventional fluorescence microscope. One class of Super-Resolution Microscopy (SRM) methods circumvents the diffraction-limited resolution by precisely estimating the positions of spatiotemporally independent emitters. However, these SRM techniques are not optimal for estimating the separation of two simultaneously excited emitters. Recently, a number of detection methods based on modal imaging have been developed to achieve the quantum Cramér-Rao lower bound (QCRB) for estimating the separation between two nearby emitters. The QCRB determines the minimum achievable precision over all possible detection methods. Current modal imaging techniques assume a scalar field generated from a point source, such as a distant source from an optical fiber or a pinhole. However, for fluorescently labeled samples, point emitters are single fluorophores that are modeled as dipole emitters and, in practice, are often freely rotating. Dipole radiation must be described by vectorial theory, and the assumption of a scalar field no longer holds. Here, we present a method to numerically calculate the QCRB for measuring the separation of two dipole emitters, incorporating the vectorial theory. Furthermore, we propose a near-quantum-optimal detection scheme based on one of the modal imaging techniques, super-localization by image inversion interferometry (SLIVER), for estimating the separation of two freely rotating dipoles. In the proposed method, we introduce a vortex wave plate before the SLIVER detection to separate the radial and azimuthal components of the dipole radiation. With numerical simulations, we demonstrate that our method achieves non-divergent precision at any separation between two dipole emitters. We investigate several practical effects relevant to experimental measurements in super-resolution microscopy, including the influence of numerical aperture, detection bandwidth, number of estimation parameters, background, and misalignment on separation estimation. Our proposed measurement provides a near-quantum-limited detection scheme for measuring the separation of two freely rotating dipole emitters, such as fluorescently tagged molecules, which are commonly used in super-resolution microscopy.
Beregi, S.; Parag, K.
Deciding when to enforce or relax non-pharmaceutical interventions (NPIs) based on real-time outbreak surveillance data is a central challenge in infectious disease epidemiology. Reporting delays and infection under-ascertainment, which characterise practical surveillance data, can misinform decision-making, prompting mistimed NPIs that fail to control spread or permitting deleterious epidemic peaks that overload healthcare capacities. To mitigate these risks, recent studies propose more data-insensitive strategies that trigger NPIs at predetermined times or infection thresholds. However, these strategies often increase NPI durations, amplifying their substantial costs to livelihood and life-quality. We develop a novel model-predictive control algorithm that optimises NPI decisions. We jointly minimise the cumulative risks and costs of interventions of different stringency over stochastic epidemic projections. Our algorithm is among the earliest to realistically incorporate uncertainties underlying both the generation and surveillance of infections. We find, except under extremely delayed reporting, that our projective approach outperforms data-insensitive strategies and show that earlier decisions strikingly improve real-time control with reduced NPI costs. Moreover, we expose how surveillance quality, disease growth and NPI frequency intrinsically limit our ability to flatten epidemic peaks or dampen endemic oscillations and reveal why this potentially makes Ebola virus more controllable than SARS-CoV-2. Our algorithm provides a general framework for guiding optimal NPI decisions ahead-of-time and identifying the key factors limiting practical epidemic control.
Author summary: In our work, we tackle the challenge of determining the best time to enforce or relax non-pharmaceutical interventions (NPIs), such as mandatory mask wearing, social distancing or quarantine, to manage the spread of infectious diseases. Making an optimal decision on NPIs requires balancing the risks and the burden of prevalent infections on healthcare systems against the costs of restrictive measures to livelihood and quality of life. Real-world data used to inform these decisions can often be unreliable due to delays in reporting and missed cases. This can lead to NPIs being implemented too late or too soon, and as such, failing to contain the outbreak or unnecessarily disrupting daily life. We introduce a novel algorithm that projects future scenarios based on current data to optimise NPI decisions across interventions with different overall stringency and costs. Our results show that our method can effectively reduce the duration and cost of NPIs while better controlling the spread of infections than more traditional approaches with fixed thresholds or NPI schedules. Our approach optimises these decisions even when data is uncertain and is a versatile tool that can adapt to changes in epidemic dynamics, such as the appearance of new variants. Moreover, we highlight how the quality of surveillance, the growth rate of the disease, and the frequency of NPIs play crucial roles in managing outbreaks, and why this potentially makes Ebola virus more controllable than SARS-CoV-2.
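A stripped-down version of the model-predictive loop: enumerate NPI sequences over a short horizon, score each by projected infections plus intervention burden on stochastic projections, and commit only to the first action. The transmission scalings, cost weight and under-ascertainment below are assumptions, and reporting delays are omitted.

```python
import itertools
import numpy as np

rng = np.random.default_rng(8)

def simulate(I0, R_eff, actions, n_sims=200):
    """Project infections for a sequence of NPI actions; each action scales
    transmission: 0 = none, 1 = moderate, 2 = stringent (assumed scalings)."""
    scale = np.array([1.0, 0.7, 0.4])
    I = np.full(n_sims, I0, dtype=float)
    total = np.zeros(n_sims)
    for a in actions:
        I = rng.poisson(np.maximum(R_eff * scale[a] * I, 1e-9)).astype(float)
        total += I
    return total

def best_action(I_hat, R_hat, horizon=6, w_npi=40.0):
    """First action of the cost-minimising NPI sequence (exhaustive search;
    the paper's algorithm additionally handles surveillance noise)."""
    best, best_cost = 0, np.inf
    for seq in itertools.product(range(3), repeat=horizon):
        infections = simulate(I_hat, R_hat, seq).mean()
        cost = infections + w_npi * sum(seq)   # infection risk + NPI burden
        if cost < best_cost:
            best, best_cost = seq[0], cost
    return best

# Surveillance gives only a noisy, under-ascertained estimate of infections
I_true, rho = 400, 0.5
I_hat = rng.binomial(I_true, rho) / rho
print("recommended NPI level:", best_action(I_hat, R_hat=1.3))
```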
Williamson, D. B.; McKinley, T.; Xiong, X.; Salter, J. M.; Challen, R.; Danon, L.; Youngman, B. D.; McNeall, D.
Infectious disease models are used to predict the spread and impact of outbreaks of a disease. Like other complex models, they have parameters that need to be calibrated, and structural discrepancies from the reality they simulate that should be accounted for in calibration and prediction. Whilst Uncertainty Quantification (UQ) techniques have been applied to infectious disease models before, they were not routinely used to inform policymakers in the UK during the COVID-19 pandemic. In this paper, we argue that during a fast-moving pandemic, models and policy are changing on timescales that make traditional UQ methods impractical, if not impossible, to implement. We present an alternative formulation of the calibration problem that embeds model discrepancy within the structure of the model and appropriately assimilates data within the simulation. We then show how UQ can be used to calibrate the model in real time to produce disease trajectories accounting for parameter uncertainty and model discrepancy. We apply these ideas to an age-structured COVID-19 model for England and demonstrate the types of information it could have produced to feed into policy support prior to the lockdown of March 2020.
Britton, T.
The purpose of the present paper is to present simple estimation and prediction methods for basic quantities in an emerging epidemic like the ongoing COVID-19 pandemic. The simple methods have the advantage that relations between basic quantities become more transparent, thus shedding light on which quantities have the biggest impact on predictions, with the additional conclusion that uncertainties in these quantities carry over into high uncertainty in predictions. A simple non-parametric prediction method for future cumulative case fatalities, as well as future cumulative incidence of infections (assuming a given infection fatality risk f), is presented. The method uses cumulative reported case fatalities up to the present time as input data. It is also described how the introduction of preventive measures of a given magnitude ρ will affect the two incidence predictions, using basic theory of epidemic models. This methodology is then reversed, thus enabling estimation of the preventive magnitude ρ and of the resulting effective reproduction number RE. However, the effects of preventive measures only start affecting case fatalities some 3-4 weeks later, so estimates are only available after this time has elapsed. The methodology is applicable in the early stage of an outbreak, before, say, 10% of the community have been infected. Besides giving simple estimation and prediction tools for an ongoing epidemic, another important conclusion lies in the observation that the two quantities f (the infection fatality risk) and ρ (the magnitude of preventive measures) have a very big impact on predictions. Further, both of these quantities currently have very high uncertainty: current estimates of f lie in the range 0.2% up to 2% ([9], [7]), and the overall effect of several combined preventive measures is clearly very uncertain. The two main findings of the paper are hence that a) any prediction involving f and/or preventive measures contains a large amount of uncertainty (which is usually not acknowledged well enough), and b) obtaining more accurate estimates of f in particular should be highly prioritized. Seroprevalence testing of random samples in a community where the epidemic has ended is urgently needed.
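The core back-calculation can be sketched in a few lines: divide lagged cumulative fatalities by an assumed infection fatality risk f, and note how strongly the answer moves across the 0.2%-2% range for f discussed in the paper. The death series below is illustrative.

```python
import numpy as np

def infections_from_deaths(cum_deaths, f):
    """Cumulative infections up to ~3-4 weeks before the last death count,
    assuming a fraction f of infections die (infection fatality risk)."""
    return cum_deaths / f

cum_deaths = np.array([10, 25, 60, 130, 260, 480, 800])   # illustrative counts
for f in (0.002, 0.01, 0.02):
    est = infections_from_deaths(cum_deaths[-1], f)
    print(f"f = {f:.1%}: cumulative infections ≈ {est:,.0f}")
```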
Albrecht, A.; Pfennig, D.; Nowak, J.; Matis, R.; Schaks, M.; Hafi, N.; Rottner, K.; Walla, P. J.
Super-resolution optical fluctuation imaging (SOFI) is a technique that uses the amplitude of fluorescence correlation data for improved resolution of fluorescence images. Here, we explore whether the amplitude of super-resolution by polarisation demodulation (SPoD) data can also be used to gain additional information about the underlying structures. Highly organized experimental as well as simulated actin filament data demonstrate an in-principle information gain from this approach. In addition, we explored theoretically the benefits of analyzing the entire 3D polarization information instead of only 2D projections thereof. Due to fundamental principles, the probability of finding parallel orientations approaches zero in 3D SPoD, in contrast to 2D approaches. Using the modulation-amplitude-based analysis, we systematically explored simulated 3D single-molecule data (for which the true structures are known) under conditions typically observed in experiments. We found that this approach can significantly improve distinction, reconstruction and localization. In addition, these approaches are less sensitive to uncertainties in the knowledge of the true experimental point spread function (PSF) used for reconstruction compared with approaches using non-modulated data. Finally, they can effectively remove higher levels of non-modulated background intensity.
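The modulation-amplitude analysis can be illustrated with a linear least-squares fit of a polarisation-modulated intensity trace, recovering amplitude and phase; the actual SPoD reconstruction operates on image stacks, so this single-trace sketch and its parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)

# Intensity of a polarisation-modulated emitter: I(t) = o + a*cos(2*(w*t - phi)).
# Amplitude a and phase phi carry the orientation information SPoD exploits;
# all values below are illustrative.
w, o, a, phi = 2 * np.pi * 0.5, 100.0, 40.0, 0.7
t = np.arange(0, 10, 0.02)
I = o + a * np.cos(2 * (w * t - phi)) + rng.normal(0, 3.0, t.size)

# cos(2wt - 2phi) = cos(2phi)cos(2wt) + sin(2phi)sin(2wt): linear in the
# unknowns, so amplitude and phase follow from ordinary least squares.
X = np.column_stack([np.ones_like(t), np.cos(2 * w * t), np.sin(2 * w * t)])
coef, *_ = np.linalg.lstsq(X, I, rcond=None)
a_hat = np.hypot(coef[1], coef[2])
phi_hat = 0.5 * np.arctan2(coef[2], coef[1])
print(f"amplitude {a_hat:.1f} (true {a}), phase {phi_hat:.2f} (true {phi})")
```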
Aronis, J. M.; Ye, Y.; Espino, J.; Michaels, M.; Hochheiser, H. M.; Cooper, G.
Tracking known influenza-like illnesses, such as influenza, is an important problem in public health and clinical medicine. The problem is complicated by the clinical similarity and co-occurrence of many of these illnesses. Additionally, detecting a new or reemergent disease, such as COVID-19, is of paramount importance as recent history has shown. This paper describes the testing of a system that tracks known influenza-like illnesses and can detect the presence of a novel disease. (This manuscript is a preprint and has not been peer reviewed.)
Cui, J.; Haddadan, A.; Haque, A. S. M. A.-U.; Adhikari, B.; Vullikanti, A.; Prakash, B. A.
One of the most significant challenges in combating the spread of infectious diseases is estimating the true magnitude of infections. Unreported infections can drive up disease spread, making it very hard to accurately estimate the infectivity of the pathogen and thereby hampering our ability to react effectively. Despite the use of surveillance-based methods such as serological studies, identifying the true magnitude is still challenging. This paper proposes an information-theoretic approach for accurately estimating the number of total infections. Our approach is built on top of models based on ordinary differential equations (ODEs), which are commonly used in epidemiology for estimating such infections. We show how to help such models better estimate the number of total infections by identifying the parametrization that requires the fewest bits to describe the observed dynamics of reported infections. Our experiments on COVID-19 spread show that our approach leads not only to substantially better estimates of the number of total infections but also to better forecasts of infections than standard calibration-based methods. We additionally show how the learned parametrization helps in modeling what-if scenarios with non-pharmaceutical interventions more accurately. Our approach provides a broadly applicable, general method for improving epidemic modeling.
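A crude two-part sketch of the information-theoretic criterion: a parametrization's score is the bits needed to encode its parameters plus the bits to encode the reported series given its predictions. The coding scheme below (fixed-precision parameters, Gaussian-coded residuals) is an assumed stand-in for the paper's formulation.

```python
import numpy as np

def description_length(reported, predicted, n_params, param_range):
    """Two-part MDL score in bits: parameter cost at fixed precision plus
    residual cost under a Gaussian fitted to the prediction errors."""
    resid = reported - predicted
    sigma = max(resid.std(), 1e-6)
    bits_resid = 0.5 * len(resid) * np.log2(2 * np.pi * np.e * sigma**2)
    bits_params = n_params * np.log2(param_range)
    return bits_params + bits_resid

# Compare two calibrations of an ODE model (predictions assumed given):
reported = np.array([120, 150, 190, 240, 290, 350, 400], dtype=float)
pred_a = np.array([118, 152, 188, 243, 287, 352, 398], dtype=float)  # 4 params
pred_b = np.array([100, 160, 170, 260, 270, 370, 380], dtype=float)  # 2 params
print("A:", description_length(reported, pred_a, 4, 1024.0))
print("B:", description_length(reported, pred_b, 2, 1024.0))
```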
Prasad, J.
The primary data for the COVID-19 pandemic is in the form of time series for the number of confirmed, recovered and dead cases. This data is updated every day and is available for most countries from multiple sources such as [Gar20b, iD20]. In this work we present a two-step procedure for model fitting to COVID-19 data. In the first step, time-dependent transmission coefficients are constructed directly from the data and, in the second step, measures of those (minimum, maximum, mean, median, etc.) are used to set priors for fitting models to data. We call this approach a "data driven approach" or "data first approach". This scheme is complementary to the Bayesian approach and can be used with or without it for parameter estimation. We use the procedure to fit a set of SIR and SIRD models, with time-dependent contact rate, to COVID-19 data for a set of the most affected countries. We find that SIR and SIRD models with constant transmission coefficients cannot fit COVID-19 data for most countries (mainly because social distancing, lockdowns, etc., make those coefficients time-dependent). We find that a contact rate decaying with time can help to fit SIR and SIRD models for most of the countries. We also present constraints on transmission coefficients and the basic reproduction number R0, as well as the effective reproduction number R(t). The main contributions of our work are as follows: (1) presenting a two-step procedure for model fitting to COVID-19 data; (2) constraining transmission coefficients as well as R0 and R(t) for a set of countries; and (3) releasing a Python package PyCov19 [Pra20b] that can be used to fit a class of compartmental models, with time-varying coefficients, to COVID-19 data.
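Step one of the described procedure, constructing a time-dependent transmission coefficient directly from the data, can be sketched from the SIRD balance equations; the series below are illustrative, and the package mentioned in the abstract (PyCov19) is not used here.

```python
import numpy as np

def transmission_rate(confirmed, recovered, dead, N):
    """Data-driven time-dependent contact rate beta(t) for an SIRD model.

    Uses the SIRD relation dS/dt = -beta * S * I / N with finite differences,
    where S = N - C and I = C - R - D are built from the reported series."""
    C, R, D = map(np.asarray, (confirmed, recovered, dead))
    S = N - C
    I = np.maximum(C - R - D, 1)        # active cases; avoid division by zero
    dS = np.diff(S)
    return -dS * N / (S[:-1] * I[:-1])

# Illustrative series (real data would come from the sources cited above)
C = np.array([100, 150, 230, 340, 480, 640, 800])
R = np.array([  0,  10,  30,  60, 110, 170, 250])
D = np.array([  0,   2,   5,  10,  16,  24,  32])
beta_t = transmission_rate(C, R, D, N=1_000_000)
print("beta(t):", beta_t.round(3))
print("median beta (prior centre for step 2):", np.median(beta_t).round(3))
```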